Snowflake Analytics — Usage Guide
Legion analytics data is replicated from MongoDB into Snowflake via DMS and Snowpipe. This doc covers how to access, explore, query, and build dashboards.
Table of Contents
- Access
- Inviting New Users
- Data Architecture
- Exploring Data (Data Preview)
- Canonical Organization ID
- Querying
- Dashboards
- Maintaining the Schema
- Infrastructure Reference
Access
- Go to app.snowflake.com and sign in
- Select the LEGION_ANALYTICS database from the left sidebar
Inviting New Users
Users with the ACCOUNTADMIN or USERADMIN role can invite new users to Snowflake.
- Go to Governance & Security → Users tab
- Click Invite
- Enter the user's Legion email address
- Assign the
SYSADMINrole - Send the invite
After accepting the invite, the new user should enable MFA on their account.
Data Architecture
Two schemas are relevant:
| Schema | Purpose |
|---|---|
RAW_CLEAN | Pipeline objects — raw tables, deduplication, pipe, stage. Do not query directly unless debugging ingestion. |
ANALYTICS | Views for everyday querying — start here. |
┌─────────────────────────────────────────────────────────────────┐
│ RAW_CLEAN schema │
│ │
│ S3/GCS Stage │
│ │ │
│ │ PIPE_EVENTS (Snowpipe — auto-ingests JSON files) │
│ ▼ │
│ RAW_EVENTS append-only, one row per DMS event │
│ │ │
│ │ deduplication (keep latest per object, every 1h) │
│ ▼ │
│ EVENTS_DEDUPED latest snapshot per (type, id, region) │
│ │
└──────────────────────────────┬──────────────────────────────────┘
│ views read from EVENTS_DEDUPED
▼
┌─────────────────────────────────────────────────────────────────┐
│ ANALYTICS schema │
│ │
│ AUDIT_V audit events (recordings, runs) │
│ WORKFLOWS_V workflow records │
│ WORKFLOW_GRAPHS_V workflow graph versions │
│ USECASES_V use case / template records │
│ USERS_V user records │
│ DECISIONS_V AI step decisions │
│ INVESTIGATION_STEP_EVENTS_V step lifecycle events │
│ ORGANIZATIONS_V org name + ID mapping │
│ │
└─────────────────────────────────────────────────────────────────┘
Object Reference
| Object | Schema | Type | Description |
|---|---|---|---|
RAW_EVENTS | RAW_CLEAN | Table | Raw ingest — one row per DMS event |
EVENTS_DEDUPED | RAW_CLEAN | Dynamic Table | Latest version per (SOURCE_TYPE, OBJECT_ID, REGION), refreshed every 1h |
ORGANIZATIONS_V | ANALYTICS | View | Maps auth_id → customer_id + org name |
AUDIT_V | ANALYTICS | View | Audit events — recordings, workflow runs, decisions |
WORKFLOWS_V | ANALYTICS | View | Workflow records |
WORKFLOW_GRAPHS_V | ANALYTICS | View | Workflow graph versions |
USECASES_V | ANALYTICS | View | Use case / template records |
USERS_V | ANALYTICS | View | User records (org_id, sub) |
DECISIONS_V | ANALYTICS | View | AI step decisions with recommendation and accuracy |
INVESTIGATION_STEP_EVENTS_V | ANALYTICS | View | Investigation step lifecycle events |
Exploring Data (Data Preview)
Before writing queries, use Snowflake's built-in preview to see what columns and values exist in a view.
- In the left sidebar, expand LEGION_ANALYTICS → ANALYTICS
- Click any view name (e.g.
AUDIT_V) - In the panel that opens on the right, click the Data Preview tab
- Sample rows with all columns and values are shown
Use this to confirm column names and value formats before writing a query.
Canonical Organization ID
Every org can be identified in two ways depending on the source:
| Field | Source |
|---|---|
auth_id | WorkOS auth system (e.g. org_01JM7E9VQ1S8Y7JWP64MGYYBAN) |
customer_id | Legion internal ID (e.g. org_HJ8em7FhxL7DhZEE) |
ORGANIZATIONS_V exposes a canonical_org_id field that resolves this automatically — it returns customer_id when a mapping exists, otherwise falls back to auth_id.
Always use canonical_org_id when joining across views. It is the stable, consistent identifier.
Find an org's canonical ID:
SELECT organization_name, canonical_org_id
FROM LEGION_ANALYTICS.ANALYTICS.ORGANIZATIONS_V
WHERE organization_name ILIKE '%acme%';
Filter another view using it:
SELECT * FROM LEGION_ANALYTICS.ANALYTICS.AUDIT_V
WHERE customer_id = '<canonical_org_id>'
LIMIT 50;
Querying
No version history. Snowflake does not version worksheets or dashboard tiles. Overwriting a query is permanent. Always develop and test in a new worksheet — only update a shared dashboard tile once you are confident the query is correct.
Open a Worksheet
- Click Worksheets in the left sidebar → + Worksheet
- Set context in the top bar: Database =
LEGION_ANALYTICS, Schema =ANALYTICS
Examples
List all orgs — start here to get canonical IDs for other queries
SELECT organization_name, canonical_org_id
FROM ORGANIZATIONS_V
ORDER BY organization_name;
All recordings for an org
SELECT created_at, type, status, usecase_id
FROM AUDIT_V
WHERE customer_id = '<canonical_org_id>'
AND type = 'recording'
ORDER BY created_at DESC;
All workflows for an org
SELECT workflow_id, usecase_id, created_at
FROM WORKFLOWS_V
WHERE customer_id = '<canonical_org_id>'
AND is_preview = false
AND deleted IS NULL
ORDER BY created_at DESC;
Workflow count across all orgs
SELECT o.organization_name, COUNT(*) AS workflows
FROM WORKFLOWS_V w
JOIN ORGANIZATIONS_V o ON w.customer_id = o.canonical_org_id
WHERE w.is_preview = false
AND w.deleted IS NULL
GROUP BY 1
ORDER BY 2 DESC;
Dashboards
Same warning applies — dashboard tiles have no version history. Test queries in a worksheet before editing a tile.
Create a New Dashboard
- Click Dashboards in the left sidebar → + Dashboard
- Name it → New Tile → New from Worksheet
- Write and run your query
- Click the chart icon (top right of results) to switch to a visualization
- Choose chart type, configure axes, then click Return to <Dashboard Name>
Add a Tile to an Existing Dashboard
- Open the dashboard → + Tile (top right) → New from Worksheet
Add Filters to a Dashboard
Filters let viewers slice all tiles by a value (e.g. filter by org) without editing any query.
- Open the dashboard → click the filter icon (top right, next to + Tile)
- Click + Filter and give it a display name (e.g.
Organization) - Set the type (Text, Date, Number, etc.)
- Click Done
Then wire each tile to the filter:
- Click the ... menu on a tile → Edit query
- In the query, replace the hardcoded value with the filter keyword using
:filter_namesyntax:WHERE customer_id = :organization - Run the query — Snowflake will prompt you to map
:organizationto the filter you created - Save and return to the dashboard — the filter now controls that tile
Wire the filter to every tile that should respond to it. Tiles not wired are unaffected.
Suggested Chart Types
| Metric | Chart |
|---|---|
| Workflow count over time | Line |
| Recordings per org | Bar |
| Time-to-workflow | Bar (sorted ascending) |
| Single KPI (e.g. total workflows) | Scorecard |
Maintaining the Schema
Adding a New Org Mapping
ORGANIZATIONS_V uses a hardcoded auth_id → customer_id map. When a new customer is onboarded:
- Open
snowflake/organization-view.sql - Add a row to the
org_mapVALUES block:('org_<AUTH_ID>', 'org_<CUSTOMER_ID>'), - Run the
CREATE OR REPLACE VIEWstatement in a Snowflake worksheet to apply the change
Infrastructure Reference
| Component | Detail |
|---|---|
| Pipe | PIPE_EVENTS — auto-ingests .json / .json.gz files from the stage |
| Stage | LEGION_ANALYTICS.RAW_CLEAN.CLEAN_LEGION_STAGE (S3/GCS) |
| Warehouse | ANALYTICS_WH (runs dynamic table refreshes) |
| Dedup lag | 1 hour — EVENTS_DEDUPED refreshes automatically on this cadence |